[CI][Bench] Create summary reports for benchmarking CI run results #19733

ianayl · 2025-08-06T21:23:53Z

Much easier to figure out what's wrong with the benchmarking CI runs when it tells you what's wrong immediately: https://github.com/intel/llvm/actions/runs/16789472825

ianayl · 2025-08-11T18:55:31Z

Test runs, feel free to get an idea of what the summaries look like here:

regression: https://github.com/intel/llvm/actions/runs/16789472825
no regression: https://github.com/intel/llvm/actions/runs/16887540034

sarnex

yaml lgtm

ianayl · 2025-08-13T17:20:38Z

@intel/llvm-reviewers-benchmarking Friendly ping

uditagarwal97 · 2025-08-13T17:25:42Z

devops/scripts/benchmarks/compare.py

+            if args.produce_github_summary:
+                gh_summary.println("### Regressions")
+                gh_summary.println(
+                    f"<details><summary>{len(regressions_ignored)} non CI-failing regressions:</summary>"


Sorry, could you clarify what you meant by non CI-failing regressions?

Hey thanks for taking a look Udit,
The benchmark CI now also runs UR benchmarks, L0 benchmarks, etc.; regressions in e.g. L0 should not cause the nightly benchmarking CI for SYCL to fail, thus they are filtered out and categorized differently.

I see. In that case, should we rename it to "non-SYCL regressions"?

Also, we don't list "regressions" in the summary where delta is less than the noise threshold, correct?

I see. In that case, should we rename it to "non-SYCL regressions"?

I feel like that would be less confusing, but I am also aware that other projects use these series of benchmarking scripts as well (i.e. UMF), so I was hesitant to hardcode "non-SYCL" into the titles/descriptions. In hindsight this should perhaps be a customizable option.

Also, we don't list "regressions" in the summary where delta is less than the noise threshold, correct?

That is correct. Noise is ignored.

devops/scripts/benchmarks/compare.py

ianayl · 2025-08-14T19:44:00Z

Amended PR based on reviewer comments. New test runs:

success (no regression): https://github.com/intel/llvm/actions/runs/16947777946
regression, but with no user-defined regression filter type: https://github.com/intel/llvm/actions/runs/16970106654
regression, but with user-defined regression filter type (in this case it's SYCL): https://github.com/intel/llvm/actions/runs/16974027983

devops/scripts/benchmarks/compare.py

Copilot

Pull Request Overview

Adds summary reports for benchmarking CI runs to improve visibility into benchmark failures. The change creates a GitHub CI summary file that provides a formatted overview of benchmark results, including improvements, filtered regressions, and concerning regressions.

Adds new command-line arguments for producing GitHub summary reports
Enhances the comparison script to generate markdown summaries alongside console output
Updates the benchmark action to create and display the summary in GitHub CI

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
devops/scripts/benchmarks/compare.py	Adds GitHub summary generation functionality with new CLI arguments and markdown formatting for benchmark results
devops/actions/run-tests/benchmark/action.yml	Updates benchmark action to generate summary file and display it in GitHub CI step summary

devops/scripts/benchmarks/compare.py

uditagarwal97 · 2025-08-15T16:04:27Z

Amended PR based on reviewer comments. New test runs:

success (no regression): https://github.com/intel/llvm/actions/runs/16947777946

regression, but with no user-defined regression filter type: https://github.com/intel/llvm/actions/runs/16970106654

regression, but with user-defined regression filter type (in this case it's SYCL): https://github.com/intel/llvm/actions/runs/16974027983

The summary looks good now. 👍🏻

uditagarwal97

LGTM. Thanks.

ianayl · 2025-08-19T19:35:39Z

@intel/llvm-gatekeepers PR is ready for merge, thanks!

CI failure is unrelated and tracked in #19354 (Currently looking into this right now)

ianayl added 5 commits August 6, 2025 09:34

Produce step summaries

0dfe023

add message indicating no regressions

d68d054

push summary later

cfaf3da

Collase not needed segments

2becef0

smaller titles

09cfa77

ianayl temporarily deployed to WindowsCILock August 6, 2025 21:24 — with GitHub Actions Inactive

ianayl had a problem deploying to WindowsCILock August 6, 2025 21:47 — with GitHub Actions Error

whitespace

83ac322

ianayl temporarily deployed to WindowsCILock August 6, 2025 21:52 — with GitHub Actions Inactive

ianayl temporarily deployed to WindowsCILock August 6, 2025 22:20 — with GitHub Actions Inactive

clang-format

4a3e296

ianayl temporarily deployed to WindowsCILock August 8, 2025 14:53 — with GitHub Actions Inactive

ianayl had a problem deploying to WindowsCILock August 8, 2025 15:15 — with GitHub Actions Failure

ianayl temporarily deployed to WindowsCILock August 8, 2025 15:15 — with GitHub Actions Inactive

ianayl had a problem deploying to WindowsCILock August 11, 2025 18:56 — with GitHub Actions Failure

ianayl marked this pull request as ready for review August 11, 2025 18:56

ianayl requested review from a team as code owners August 11, 2025 18:56

ianayl temporarily deployed to WindowsCILock August 11, 2025 18:57 — with GitHub Actions Inactive

sarnex approved these changes Aug 11, 2025

View reviewed changes

ianayl temporarily deployed to WindowsCILock August 11, 2025 19:20 — with GitHub Actions Inactive

uditagarwal97 reviewed Aug 13, 2025

View reviewed changes

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

uditagarwal97 reviewed Aug 13, 2025

View reviewed changes

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

uditagarwal97 reviewed Aug 13, 2025

View reviewed changes

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

ianayl temporarily deployed to WindowsCILock August 13, 2025 19:51 — with GitHub Actions Inactive

ianayl temporarily deployed to WindowsCILock August 13, 2025 20:14 — with GitHub Actions Inactive

clang-format

22c5cb3

ianayl temporarily deployed to WindowsCILock August 14, 2025 15:49 — with GitHub Actions Inactive

ianayl had a problem deploying to WindowsCILock August 14, 2025 16:48 — with GitHub Actions Error

ianayl temporarily deployed to WindowsCILock August 14, 2025 16:48 — with GitHub Actions Inactive

add SYCL as filter

d264d07

ianayl temporarily deployed to WindowsCILock August 14, 2025 18:54 — with GitHub Actions Inactive

ianayl had a problem deploying to WindowsCILock August 14, 2025 19:15 — with GitHub Actions Failure

ianayl temporarily deployed to WindowsCILock August 14, 2025 19:15 — with GitHub Actions Inactive

ianayl temporarily deployed to WindowsCILock August 14, 2025 19:45 — with GitHub Actions Inactive

ianayl requested a review from uditagarwal97 August 15, 2025 15:53

uditagarwal97 reviewed Aug 15, 2025

View reviewed changes

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

devops/scripts/benchmarks/compare.py Show resolved Hide resolved

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

uditagarwal97 requested a review from Copilot August 15, 2025 16:02

Copilot AI reviewed Aug 15, 2025

View reviewed changes

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

devops/scripts/benchmarks/compare.py Show resolved Hide resolved

devops/scripts/benchmarks/compare.py Outdated Show resolved Hide resolved

amend to comments

a2640aa

ianayl temporarily deployed to WindowsCILock August 15, 2025 20:14 — with GitHub Actions Inactive

uditagarwal97 approved these changes Aug 15, 2025

View reviewed changes

ianayl had a problem deploying to WindowsCILock August 15, 2025 20:37 — with GitHub Actions Error

fix bug

6c3dffe

ianayl temporarily deployed to WindowsCILock August 15, 2025 20:41 — with GitHub Actions Inactive

ianayl had a problem deploying to WindowsCILock August 15, 2025 21:03 — with GitHub Actions Failure

ianayl temporarily deployed to WindowsCILock August 15, 2025 21:03 — with GitHub Actions Inactive

uditagarwal97 merged commit 6c7bc5c into sycl Aug 19, 2025
31 of 33 checks passed

uditagarwal97 deleted the ianayl/benchmark-ci-summaries branch August 19, 2025 19:38

[CI][Bench] Create summary reports for benchmarking CI run results #19733

[CI][Bench] Create summary reports for benchmarking CI run results #19733

Uh oh!

Conversation

ianayl commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ianayl commented Aug 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sarnex left a comment

Choose a reason for hiding this comment

Uh oh!

ianayl commented Aug 13, 2025

Uh oh!

uditagarwal97 Aug 13, 2025

Choose a reason for hiding this comment

Uh oh!

ianayl Aug 13, 2025

Choose a reason for hiding this comment

Uh oh!

uditagarwal97 Aug 13, 2025

Choose a reason for hiding this comment

Uh oh!

uditagarwal97 Aug 13, 2025

Choose a reason for hiding this comment

Uh oh!

ianayl Aug 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ianayl commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

uditagarwal97 commented Aug 15, 2025

Uh oh!

uditagarwal97 left a comment

Choose a reason for hiding this comment

Uh oh!

ianayl commented Aug 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ianayl commented Aug 6, 2025 •

edited

Loading

ianayl commented Aug 11, 2025 •

edited

Loading

ianayl Aug 13, 2025 •

edited

Loading

ianayl commented Aug 14, 2025 •

edited

Loading